
    Building a P2P RDF Store for Edge Devices

    Semantic Web technologies have been used in the Internet of Things (IoT) to facilitate data interoperability and address data heterogeneity issues. The Resource Description Framework (RDF) model is employed to integrate IoT data, with RDF engines serving as gateways for semantic integration. However, storing and querying RDF data obtained from distributed sources across a dynamic network of edge devices is challenging. The distributed nature of the edge shares similarities with Peer-to-Peer (P2P) systems, including node heterogeneity, limited availability, and constrained resources, with nodes primarily undertaking data storage and processing tasks. P2P models therefore appear to be an attractive approach for constructing distributed RDF stores. Based on P-Grid, a data indexing mechanism for load balancing and range query processing in P2P systems, this paper proposes a design for storing and sharing RDF data on P2P networks of low-cost edge devices. Our design integrates P-Grid with an edge-based RDF storage solution, RDF4Led, to build a P2P RDF engine. This integration maintains RDF data access and query processing while scaling with increasing data and network size. We demonstrate the scaling behavior of our implementation on a P2P network of up to 16 Raspberry Pi 4 nodes.
    Comment: Accepted to IoT Conference 202
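    A minimal sketch of the core idea, assuming names of our own invention (triple_key, Peer, route are not the RDF4Led or P-Grid API): each triple is mapped to a binary key, each peer owns a key prefix, and lookups are routed by the first bit at which a key leaves the local prefix. An order-preserving key encoding is what makes range queries possible in P-Grid.

```python
def triple_key(s: str, p: str, o: str, bits: int = 16) -> str:
    """Map a triple to a binary key. The toy encoding below is
    order-preserving on the first two characters of the subject IRI,
    so lexicographically close triples land on nearby peers."""
    n = sum(ord(c) << (8 * i) for i, c in enumerate(s[:2][::-1]))
    return format(n % (1 << bits), f"0{bits}b")

class Peer:
    def __init__(self, prefix: str, routing_table: dict):
        self.prefix = prefix          # key prefix this peer is responsible for
        self.routing = routing_table  # level -> a peer on the other side of that bit
        self.store = []               # local triples (an engine like RDF4Led sits here)

    def route(self, key: str, triple):
        # Forward along the first bit where the key leaves our prefix.
        for level, bit in enumerate(self.prefix):
            if key[level] != bit:
                return self.routing[level].route(key, triple)
        self.store.append(triple)     # key falls in our subtree: store locally
```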

    Semantic Programming for Device-Edge-Cloud Continuum

    This position paper presents ThothSP, a Semantic Programming framework that aims to lower the coding effort of building smart applications on the Device-Edge-Cloud continuum by leveraging semantic knowledge. It introduces a novel neural-symbolic stream fusion mechanism, which enables the specification of data fusion pipelines via declarative rules with learnable probabilistic weights. Moreover, it includes an adaptive federator that allows the ThothSP runtime to be distributed across multiple compute nodes in a network and to coordinate their resources to collaboratively process tasks by delegating partial workloads to peers. To demonstrate ThothSP's capability, we report a case study on a distributed camera network that compares ThothSP's behaviour with a traditional edge-cloud setup.
    Comment: arXiv admin note: text overlap with arXiv:2202.1395
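    To make "declarative rules with learnable probabilistic weights" concrete, here is a hedged sketch under our own assumptions (the Rule class and the logistic update are illustrative, not ThothSP's actual API): a rule carries a weight whose sigmoid is its confidence, and the weight is nudged by feedback whenever the rule fires.

```python
import math

class Rule:
    def __init__(self, name, condition, weight=0.0):
        self.name = name
        self.condition = condition    # predicate over a fused stream element
        self.weight = weight          # logit; sigmoid(weight) = rule confidence

    def confidence(self) -> float:
        return 1.0 / (1.0 + math.exp(-self.weight))

    def update(self, element, label: bool, lr: float = 0.1):
        """One gradient step of logistic regression on a single rule,
        applied only when the rule's condition matches the element."""
        if self.condition(element):
            self.weight += lr * (float(label) - self.confidence())

# Example: a rule fusing detections from two cameras (field names hypothetical).
same_object = Rule(
    "same_object",
    lambda e: e["cam1_label"] == e["cam2_label"] and e["iou"] > 0.5,
)
```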

    Special Issue on High-Level Declarative Stream Processing

    Stream processing as an information processing paradigm has been investigated by various research communities within computer science and appears in many applications: real-time analytics, online machine learning, continuous computation, ETL operations, and more. The special issue on "High-Level Declarative Stream Processing" investigates the declarative aspects of stream processing, a topic undergoing intense study. It is published in the Open Journal of Web Technologies (OJWT) (www.ronpub.com/ojwt). This editorial provides an overview of the aims and scope of the special issue and the accepted papers.

    Provenance Management over Linked Data Streams

    Provenance describes how results are produced, from data sources through curation, recovery, and intermediate processing to the final results. Provenance has been applied to solve many problems, in particular to understand how errors propagate in large-scale environments such as the Internet of Things and smart cities. In such environments, operations on data are often performed by multiple uncoordinated parties, each potentially introducing or propagating errors. These errors cause uncertainty in the overall data analytics process, which is further amplified when many data sources are combined and errors propagate across multiple parties. The ability to properly identify how such errors influence the results is crucial to assessing the quality of the results. This problem becomes even more challenging in the case of Linked Data streams, where data is dynamic and often incomplete. In this paper, we introduce methods to compute provenance over Linked Data streams. More specifically, we propose provenance management techniques to compute the provenance of continuous queries executed over Linked Data streams. Unlike traditional provenance management techniques, which are applied to static data, we focus strictly on the dynamicity and heterogeneity of Linked Data streams. Specifically, we describe: i) means to deliver a dynamic provenance trace of the results to the user, ii) a system capable of executing queries over dynamic Linked Data and computing the provenance of these queries, and iii) an empirical evaluation of our approach using real-world datasets.
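    A minimal sketch of the underlying pattern, assuming illustrative structures of our own (AnnotatedTriple and windowed_join are not the paper's implementation): each stream element carries the set of source identifiers it was derived from, and every operator of a continuous query propagates the union of its inputs' provenance.

```python
class AnnotatedTriple:
    def __init__(self, s, p, o, sources):
        self.spo = (s, p, o)
        self.sources = frozenset(sources)   # provenance: where this triple came from

def windowed_join(window_a, window_b):
    """Join two stream windows on subject; the provenance of each joined
    result is the union of the provenance of its two inputs."""
    for a in window_a:
        for b in window_b:
            if a.spo[0] == b.spo[0]:
                yield AnnotatedTriple(
                    a.spo[0], "joined", (a.spo[2], b.spo[2]),
                    a.sources | b.sources,
                )
```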

    EAGLE—A Scalable Query Processing Engine for Linked Sensor Data

    Recently, many approaches have been proposed to manage sensor data using Semantic Web technologies for effective heterogeneous data integration. However, our empirical observations reveal that these solutions primarily focus on semantic relationships and pay less attention to spatio-temporal correlations. Most semantic approaches have no spatio-temporal support; some attempt to provide full support but perform poorly on complex spatio-temporal aggregate queries. In addition, while the volume of sensor data is growing rapidly, the challenge of querying and managing the massive volumes of data generated by sensing devices remains unsolved. In this article, we introduce EAGLE, a spatio-temporal query engine for querying sensor data based on the linked data model. The ultimate goal of EAGLE is to provide an elastic and scalable system that allows fast searching and analysis of the relationships of space, time, and semantics in sensor data. We also extend SPARQL with a set of new query operators to support spatio-temporal computing in the linked sensor data context.
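    The following is a sketch of the kind of spatio-temporal aggregate query such SPARQL extensions target. The operator names (eagle:within, eagle:during) and the eagle: namespace are hypothetical placeholders of our own; the paper defines its own operator set.

```python
# A SPARQL query with spatial and temporal filter functions, held as a
# Python string; sosa: is the standard W3C Sensor/Observation vocabulary.
QUERY = """
PREFIX eagle: <http://example.org/eagle#>
PREFIX sosa:  <http://www.w3.org/ns/sosa/>

SELECT ?sensor (AVG(?value) AS ?avg)
WHERE {
  ?obs sosa:madeBySensor ?sensor ;
       sosa:hasSimpleResult ?value ;
       sosa:resultTime ?t .
  FILTER (eagle:within(?sensor, "POLYGON((...))"^^eagle:wkt))
  FILTER (eagle:during(?t, "2024-01-01T00:00:00Z", "2024-01-02T00:00:00Z"))
}
GROUP BY ?sensor
"""
```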

    Pushing the Scalability of RDF Engines on IoT Edge Devices

    Semantic interoperability for the Internet of Things (IoT) is enabled by standards and technologies from the Semantic Web. As recent research suggests a move towards decentralised IoT architectures, we have investigated the scalability and robustness of RDF (Resource Description Framework) engines that can be embedded throughout the architecture, in particular at edge nodes. RDF processing at the edge facilitates the deployment of semantic integration gateways closer to low-level devices. Our focus is on enabling scalable and robust RDF engines that can operate on lightweight devices. In this paper, we first carry out an empirical study of the scalability and behaviour of RDF data management solutions, built for standard computing hardware, that have been ported to lightweight devices at the network edge. The findings of our study show that these RDF stores have several shortcomings on commodity ARM (Advanced RISC Machine) boards, which are representative of IoT edge node hardware. This inspired us to introduce RDF4Led, a lightweight RDF engine comprising an RDF storage layer and a SPARQL processor for lightweight edge devices. RDF4Led follows the RISC-style (Reduced Instruction Set Computer) design philosophy. The design comprises a flash-aware storage structure, an indexing scheme, an alternative buffer management technique, and a low-memory-footprint join algorithm, and demonstrates improved scalability and robustness over competing solutions. With a significantly smaller memory footprint, we show that RDF4Led can handle 2 to 5 times more data than popular RDF engines such as Jena TDB (Tuple Database) and RDF4J while consuming the same amount of memory. In particular, RDF4Led requires only 10%–30% of the memory of its competitors to operate on datasets of up to 50 million triples. On memory-constrained ARM boards, it performs faster updates and scales better than Jena TDB and Virtuoso. Furthermore, it performs query operations considerably faster than Jena TDB and RDF4J.
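    As a sketch of the general idea behind a low-memory-footprint join (not RDF4Led's actual code): when triples are held in sorted permutation indexes, two triple patterns can be joined by a streaming merge whose memory use is bounded by a single key group, rather than by materialising intermediate results.

```python
def merge_join(left, right):
    """Merge-join two iterators of (join_key, binding) pairs, each sorted
    by join_key; only one key group from the right side is ever buffered."""
    left, right = iter(left), iter(right)
    l, r = next(left, None), next(right, None)
    while l and r:
        if l[0] < r[0]:
            l = next(left, None)
        elif l[0] > r[0]:
            r = next(right, None)
        else:
            key = l[0]
            group = []
            while r and r[0] == key:       # buffer one key group from the right
                group.append(r)
                r = next(right, None)
            while l and l[0] == key:       # stream the left side against it
                for g in group:
                    yield key, l[1], g[1]
                l = next(left, None)
```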

    VisionKG: Unleashing the Power of Visual Datasets via Knowledge Graph

    The availability of vast amounts of visual data with heterogeneous features is a key factor in developing, testing, and benchmarking new computer vision (CV) algorithms and architectures. Most visual datasets are created and curated for specific tasks or with limited image data distributions for very specific situations, and there is no unified approach to manage and access them across diverse sources, tasks, and taxonomies. This not only creates unnecessary overheads when building robust visual recognition systems, but also introduces biases into learning systems and limits the capabilities of data-centric AI. To address these problems, we propose the Vision Knowledge Graph (VisionKG), a novel resource that interlinks, organizes, and manages visual datasets via knowledge graphs and Semantic Web technologies. It can serve as a unified framework facilitating simple access to and querying of state-of-the-art visual datasets, regardless of their heterogeneous formats and taxonomies. One of the key differences between our approach and existing methods is that ours is knowledge-based rather than metadata-based. It enhances the enrichment of semantics at both the image and instance levels and offers various data retrieval and exploratory services via SPARQL. VisionKG currently contains 519 million RDF triples that describe approximately 40 million entities, and is accessible at https://vision.semkg.org and through APIs. With the integration of 30 datasets and four popular CV tasks, we demonstrate its usefulness across various scenarios when working with CV pipelines.
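    A hedged sketch of querying such a SPARQL service with the SPARQLWrapper library. The endpoint path and the kg: vocabulary below are assumptions for illustration; consult https://vision.semkg.org for the actual service description and schema.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://vision.semkg.org/sparql")  # assumed endpoint path
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    PREFIX kg: <http://example.org/visionkg#>   # placeholder namespace
    SELECT ?image ?label
    WHERE {
        ?ann kg:onImage ?image ;
             kg:hasLabel ?label .
        FILTER (?label = "person")
    }
    LIMIT 10
""")

# Print the image IRI and label of each returned annotation.
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["image"]["value"], row["label"]["value"])
```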

    Enabling IoT ecosystems through platform interoperability

    Today, the Internet of Things (IoT) comprises vertically oriented platforms for things. Developers who want to use them need to negotiate access individually and adapt to platform-specific APIs and information models. Having to perform these actions for each platform often outweighs the possible gains from adapting applications to multiple platforms. This fragmentation of the IoT and the missing interoperability result in high entry barriers for developers and prevent the emergence of broadly accepted IoT ecosystems. The BIG IoT (Bridging the Interoperability Gap of the IoT) project aims to ignite an IoT ecosystem as part of the European Platforms Initiative. As part of the project, researchers have devised an IoT ecosystem architecture. It employs five interoperability patterns that enable cross-platform interoperability and can help establish successful IoT ecosystems.

    The SSN ontology of the W3C semantic sensor network incubator group

    The W3C Semantic Sensor Network Incubator group (the SSN-XG) produced an OWL 2 ontology to describe sensors and observations: the SSN ontology, available at http://purl.oclc.org/NET/ssnx/ssn. The SSN ontology can describe sensors in terms of capabilities, measurement processes, observations, and deployments. This article describes the SSN ontology, gives an example, and describes the use of the ontology in recent research projects.
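    A minimal sketch of describing a sensor and an observation with SSN terms in rdflib, using the namespace given above; the sensor and observation IRIs are illustrative.

```python
from rdflib import Graph, Namespace, URIRef, Literal

SSN = Namespace("http://purl.oclc.org/NET/ssnx/ssn#")
g = Graph()
g.bind("ssn", SSN)

sensor = URIRef("http://example.org/sensors/thermo1")
obs = URIRef("http://example.org/observations/obs42")

# A sensor observes a property; an observation links back to the sensor.
g.add((sensor, SSN.observes, URIRef("http://example.org/props/AirTemperature")))
g.add((obs, SSN.observedBy, sensor))
g.add((obs, SSN.observationResult, Literal("21.5")))

print(g.serialize(format="turtle"))
```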

    Best Practices for Publishing, Retrieving, and Using Spatial Data on the Web

    Data owners are creating an ever richer set of information resources online, and these are being used for more and more applications. With the rapid growth of connected embedded devices, GPS-enabled mobile devices, and various organizations that publish location-based data (e.g., weather and traffic services), maps, and geographical and spatial information (e.g., GIS and open maps), spatial data on the Web is becoming ubiquitous and voluminous. However, the heterogeneity of the available spatial data, together with challenges specific to spatial data, makes it difficult for data users, web applications, and services to discover, interpret, and use the information in large and distributed web systems. This paper summarizes some of the efforts undertaken in the joint W3C/OGC Working Group on Spatial Data on the Web, in particular the effort to describe best practices for publishing spatial data on the Web. It presents the set of principles that guide the selection of these best practices, describes best practices employed to enable publishing, discovering, and retrieving (querying) this type of data on the Web, and identifies some areas where a best practice has not yet emerged.
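    One widely used way to make a spatial feature discoverable and queryable on the Web is to publish it as linked data with a GeoSPARQL geometry; a minimal rdflib sketch follows (the feature and geometry IRIs are illustrative, and GeoSPARQL is named here as a well-known OGC standard rather than as the working group's prescribed practice).

```python
from rdflib import Graph, Namespace, URIRef, Literal

GEO = Namespace("http://www.opengis.net/ont/geosparql#")
g = Graph()
g.bind("geo", GEO)

feature = URIRef("http://example.org/places/city-park")
geom = URIRef("http://example.org/places/city-park/geom")

# Link the feature to its geometry, serialised as a WKT literal.
g.add((feature, GEO.hasGeometry, geom))
g.add((geom, GEO.asWKT, Literal("POINT(-0.1276 51.5072)", datatype=GEO.wktLiteral)))

print(g.serialize(format="turtle"))
```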